NSF PAR Search | NSF Public Access Repository

Propositional Extraction from Collaborative Naturalistic Dialogues

https://doi.org/10.5281/zenodo.15042102

Venkatesha, Videep; Nath, Abhijnan; Khebour, Ibrahim; Chelle, Avyakta; Bradford, Mariah; Tu, Jingxuan; VanderHoeven, Hannah; Bhalla, Brady; Youngren, Austin; Fitzgerald, Jack; et al (January 2025, Journal of Educational Data Mining)

In collaborative learning, extracting the beliefs shared within a group is a critical capability for navigating complex tasks. Inherent in this problem is the fact that in naturalistic collaborative discourse, the same propositional content may be expressed in radically different ways. This difficulty is exacerbated when speech overlaps and other communicative modalities are used, as is the case in a co-situated collaborative task. In this paper, we conduct a comparative methodological analysis of techniques for extracting task-relevant propositions from natural speech dialogues in a challenging shared task setting where participants collaboratively determine the weights of five blocks using only a balance scale. We encode utterances and candidate propositions through language models and compare a cross-encoder method, adapted from coreference research, to a vector similarity baseline. Our cross-encoder approach outperforms both a cosine similarity baseline and zero-shot inference by the GPT-4 and LLaMA 2 language models, and we establish a novel baseline on this challenging task on two collaborative task datasets (the Weights Task and DeliData), showing the generalizability of our approach. Furthermore, we explore the use of state-of-the-art large language models for data augmentation to enhance performance, extend our examination to transcripts generated by Google's Automatic Speech Recognition system to assess the potential for automating propositional extraction in real time, and introduce a framework for live propositional extraction from natural speech and multimodal signals. This study not only demonstrates the feasibility of detecting collaboration-relevant content in unstructured interactions but also lays the groundwork for employing AI to enhance collaborative problem-solving in classrooms and in other collaborative settings, such as the workforce.
Our code is available at https://github.com/csu-signal/PropositionExtraction.
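
As a rough illustration of the vector similarity baseline mentioned in the abstract, the sketch below scores an utterance against candidate propositions by cosine similarity and returns the best match. It is not the authors' implementation: the `embed` function is a toy bag-of-words stand-in for a language-model encoder, and the function names, the example utterance, and the proposition wording are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a language-model encoder (an assumption, not the
    paper's encoder): a sparse bag-of-words vector of token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector has no direction
    return dot / (norm_u * norm_v)


def best_proposition(utterance, candidates):
    """Return (score, candidate) for the most similar candidate proposition."""
    u = embed(utterance)
    scored = [(cosine(u, embed(c)), c) for c in candidates]
    return max(scored, key=lambda pair: pair[0])


# Hypothetical candidate propositions for a block-weighing task.
candidates = [
    "red block weighs 10 grams",
    "blue block weighs 10 grams",
    "red block equals blue block",
]
score, match = best_proposition("I think the red block is 10 grams", candidates)
print(f"{match} ({score:.3f})")
```

A cross-encoder, by contrast, would score each utterance and candidate pair jointly in a single model pass instead of comparing two independently computed vectors, which is the distinction the abstract draws between the two approaches.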
Full Text Available
Common Ground Tracking in Multimodal Dialogue

Khebour, Ibrahim; Lai, Kenneth; Bradford, Mariah; Zhu, Yifan; Brutti, Richard; Tam, Christopher; Tu, Jingxuan; Ibarra, Benjamin; Blanchard, Nathaniel; Krishnaswamy, Nikhil; et al (May 2024, ACL)

Within Dialogue Modeling research in AI and NLP, considerable attention has been paid to “dialogue state tracking” (DST), which is the ability to update the representations of the speaker’s needs at each turn in the dialogue by taking into account the past dialogue moves and history. Less studied but just as important to dialogue modeling, however, is “common ground tracking” (CGT), which identifies the shared belief space held by all of the participants in a task-oriented dialogue: the task-relevant propositions all participants accept as true. In this paper we present a method for automatically identifying the current set of shared beliefs and “questions under discussion” (QUDs) of a group with a shared goal. We annotate a dataset of multimodal interactions in a shared physical space with speech transcriptions, prosodic features, gestures, actions, and facets of collaboration, and operationalize these features for use in a deep neural model to predict moves toward construction of common ground. Model outputs cascade into a set of formal closure rules derived from situated evidence and belief axioms and update operations. We empirically assess the contribution of each feature type toward successful construction of common ground relative to ground truth, establishing a benchmark in this novel, challenging task.
Full Text Available
Modeling Theory of Mind in Multimodal HCI

https://doi.org/10.1007/978-3-031-60405-8_14

Zhu, Yifan; VanderHoeven, Hannah; Lai, Kenneth; Bradford, Mariah; Tam, Christopher; Khebour, Ibrahim; Brutti, Richard; Krishnaswamy, Nikhil; Pustejovsky, James (January 2024, Springer Nature Switzerland)

Full Text Available
When Text and Speech are Not Enough: A Multimodal Dataset of Collaboration in a Situated Task

https://doi.org/10.5334/johd.168

Khebour, Ibrahim; Brutti, Richard; Dey, Indrani; Dickler, Rachel; Sikes, Kelsey; Lai, Kenneth; Bradford, Mariah; Cates, Brittany; Hansen, Paige; Jung, Changsoo; et al (January 2024, Journal of Open Humanities Data)

Full Text Available